RinsMatch: a suggestion-based instance matching system in RDF Graphs
Abstract
Introduction. In this paper, we present RinsMatch (RDF Instance Match), a suggestion-based instance matching tool for RDF graphs. RinsMatch uses a graph node similarity algorithm and returns to the user the subject node pairs whose similarity exceeds a defined threshold. If the user approves a suggested match, the two nodes are merged. Further candidate pairs are then generated from the common predicates and neighbors of the already matched nodes and presented to the user. RinsMatch then reruns the similarity algorithm on the graph with the merged RDF nodes. This process continues until the user provides no more feedback and the similarity algorithm suggests no new candidate pairs. In our previous study [1], we proposed an algorithm that computes the similarity of entities in an RDF graph using graph locality, neighborhood similarity, and the Jaccard measure. In the current study, we use that algorithm to pair entities, which are merged if the user approves. Like the similarity flooding (SF) algorithm proposed in [2], we assume that elements of two graphs are similar when their adjacent elements are similar. Compared to SF, our technique requires more user interaction and more iterations to compute entity similarity, but each run of the similarity algorithm produces more accurate results, assuming the user's feedback is accurate. In addition, merging RDF nodes reduces the size of the input graph that the algorithm operates on, lowering the cost of each successive run.
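The suggest, approve, merge, and rerun cycle described in the abstract can be illustrated with a minimal Python sketch. This is not the RinsMatch implementation or the algorithm of [1]: it assumes a graph stored as a set of (subject, predicate, object) tuples, takes a node's neighborhood to be its outgoing (predicate, object) pairs, and scores pairs with the plain Jaccard measure. The function names, the sample data, and the 0.4 threshold are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard measure |a & b| / |a | b|; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def neighborhoods(triples):
    """Map each subject to the set of its outgoing (predicate, object) pairs."""
    hoods = {}
    for s, p, o in triples:
        hoods.setdefault(s, set()).add((p, o))
    return hoods


def candidate_pairs(triples, threshold):
    """Return subject pairs whose similarity exceeds the threshold, best first."""
    hoods = neighborhoods(triples)
    pairs = []
    for a, b in combinations(sorted(hoods), 2):
        score = jaccard(hoods[a], hoods[b])
        if score > threshold:
            pairs.append((a, b, score))
    return sorted(pairs, key=lambda t: -t[2])


def merge(triples, keep, drop):
    """Rewrite every occurrence of `drop` to `keep`; duplicate triples collapse."""
    def swap(x):
        return keep if x == drop else x
    return {(swap(s), p, swap(o)) for s, p, o in triples}


# Hypothetical sample graph (assumption, not data from the paper).
graph = {
    ("ex:a1", "name", "Ada"), ("ex:a1", "born", "1815"), ("ex:a1", "field", "math"),
    ("ex:a2", "name", "Ada"), ("ex:a2", "born", "1815"), ("ex:a2", "field", "computing"),
    ("ex:b", "name", "Bob"), ("ex:b", "born", "1900"),
    ("ex:c1", "knows", "ex:a1"),
    ("ex:c2", "knows", "ex:a2"),
}

first_round = candidate_pairs(graph, threshold=0.4)   # suggests ex:a1 / ex:a2
merged = merge(graph, keep="ex:a1", drop="ex:a2")      # user approved the pair
second_round = candidate_pairs(merged, threshold=0.4)  # rerun on the smaller graph
```

In this toy graph the first round suggests only ex:a1 and ex:a2. Once they are merged, ex:c1 and ex:c2 share the neighbor ex:a1 and surface as a new candidate pair, and the graph shrinks from 10 triples to 8, mirroring the two effects the abstract attributes to merging.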
Similar resources
An unsupervised instance matcher for schema-free RDF data
This article presents an unsupervised system that performs instance matching between entities in schema-free Resource Description Framework (RDF) files. Rather than relying on domain expertise or manually labeled samples, the system automatically generates its own heuristic training set. The training sets are first used by the system to align the properties in the input graphs. The property ali...
Making More Wikipedians: Facilitating Semantics Reuse for Wikipedia Authoring
Wikipedia, a killer application in Web 2.0, has embraced the power of collaborative editing to harness collective intelligence. It can also serve as an ideal Semantic Web data source due to its abundance, influence, high quality and well-structuring. However, the heavy burden of up-building and maintaining such an enormous and ever-growing online encyclopedic knowledge base still rests on a ver...
One Size Does not Fit All: When to Use Signature-based Pruning to Improve Template Matching for RDF graphs
Signature-based pruning is broadly accepted as an effective way to improve query performance of graph template matching on general labeled graphs. Most techniques which utilize signature-based pruning claim its benefits on all datasets and queries. However, the effectiveness of signature-based pruning varies greatly among different RDF datasets and highly related to their dataset characteristic...
Taming Subgraph Isomorphism for RDF Query Processing
RDF data are used to model knowledge in various areas such as life sciences, Semantic Web, bioinformatics, and social graphs. The size of real RDF data reaches billions of triples. This calls for a framework for efficiently processing RDF data. The core function of processing RDF data is subgraph pattern matching. There have been two completely different directions for supporting efficient subg...
An Approach for Semantic Search by Matching RDF Graphs
The World Wide Web is developing rapidly, but neither recall nor precision of traditional search engines can satisfy the increasing demands of users. Presently, RDF is widely accepted as a standard for semantic representation of information on the Web, which makes possible the advanced search among web resources. In this paper, we introduce an approach for semantic search by matching RDF graphs...
Publication date: 2015